Conference Proceedings
Tailoring the Shapley Value for In-Context Example Selection Towards Data Wrangling
Z Liang, H Wang, X Ding, Z Liang, C Liang, Y Tang, J Qi
Proceedings International Conference on Data Engineering | IEEE | Published: 2025
Abstract
Data wrangling (DW) is a fundamental step to prepare data for downstream mining tasks. Recent studies explore large language models (LLMs) to form a lightweight DW paradigm. Such studies typically require prompting an LLM with a DW task together with a few examples as task demonstrations (i.e., in-context learning). A problem yet to be explored is how to select the examples, to maximize task effectiveness given constraints on the size of the examples. To fill this gap, we introduce the constrained Shapley value (CSV), a tailored variant of the Shapley value with a constraint on the LLM prompt size, to guide example selection. We show that CSV has desirable properties in example importance es..
Grants
Awarded by National Natural Science Foundation of China